Source https://cedricscherer.netlify.app/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/
chic <- readr::read_csv("https://raw.githubusercontent.com/Z3tt/R-Tutorials/master/ggplot2/chicago-nmmaps.csv")
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## city = col_character(),
## date = col_date(format = ""),
## death = col_double(),
## temp = col_double(),
## dewpoint = col_double(),
## pm10 = col_double(),
## o3 = col_double(),
## time = col_double(),
## season = col_character(),
## year = col_double()
## )
## Rows: 1,461
## Columns: 10
## $ city <chr> "chic", "chic", "chic", "chic", "chic", "chic", "chic", "chi…
## $ date <date> 1997-01-01, 1997-01-02, 1997-01-03, 1997-01-04, 1997-01-05,…
## $ death <dbl> 137, 123, 127, 146, 102, 127, 116, 118, 148, 121, 110, 127, …
## $ temp <dbl> 36.0, 45.0, 40.0, 51.5, 27.0, 17.0, 16.0, 19.0, 26.0, 16.0, …
## $ dewpoint <dbl> 37.500, 47.250, 38.000, 45.500, 11.250, 5.750, 7.000, 17.750…
## $ pm10 <dbl> 13.052268, 41.948600, 27.041751, 25.072573, 15.343121, 9.364…
## $ o3 <dbl> 5.659256, 5.525417, 6.288548, 7.537758, 20.760798, 14.940874…
## $ time <dbl> 3654, 3655, 3656, 3657, 3658, 3659, 3660, 3661, 3662, 3663, …
## $ season <chr> "Winter", "Winter", "Winter", "Winter", "Winter", "Winter", …
## $ year <dbl> 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, …
## # A tibble: 10 x 10
## city date death temp dewpoint pm10 o3 time season year
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 chic 1997-01-01 137 36 37.5 13.1 5.66 3654 Winter 1997
## 2 chic 1997-01-02 123 45 47.2 41.9 5.53 3655 Winter 1997
## 3 chic 1997-01-03 127 40 38 27.0 6.29 3656 Winter 1997
## 4 chic 1997-01-04 146 51.5 45.5 25.1 7.54 3657 Winter 1997
## 5 chic 1997-01-05 102 27 11.2 15.3 20.8 3658 Winter 1997
## 6 chic 1997-01-06 127 17 5.75 9.36 14.9 3659 Winter 1997
## 7 chic 1997-01-07 116 16 7 20.2 11.9 3660 Winter 1997
## 8 chic 1997-01-08 118 19 17.8 33.1 8.68 3661 Winter 1997
## 9 chic 1997-01-09 148 26 24 12.1 13.4 3662 Winter 1997
## 10 chic 1997-01-10 121 16 5.38 24.8 10.4 3663 Winter 1997
##
library(ggplot2)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2 ✓ purrr 0.3.4
## ✓ tibble 3.0.4 ✓ dplyr 1.0.2
## ✓ tidyr 1.1.2 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
We specify the data outside aes() and add the variables that ggplot maps the aesthetics to inside aes().
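As a minimal sketch of that split between data and aesthetics, the snippet below builds a base plot object `g` of the kind that later examples add layers to. It uses a small made-up data frame, `chic_demo`, as a hypothetical stand-in for `chic` so that it runs without downloading anything; only the column names match the real data.

```r
library(ggplot2)

# Hypothetical stand-in for `chic`: same column names, made-up values
set.seed(1)
dates <- seq(as.Date("1997-01-01"), by = "month", length.out = 48)
chic_demo <- data.frame(
  date = dates,
  temp = 50 - 25 * cos(2 * pi * (seq_along(dates) - 1) / 12) + rnorm(48, sd = 5)
)

# Data goes outside aes(); the variable-to-aesthetic mapping goes inside
g <- ggplot(chic_demo, aes(x = date, y = temp))

# The base plot has no layers yet; adding a geom draws the data
g_points <- g + geom_point()
g_line   <- g + geom_line()
```

Printing `g` on its own only shows an empty panel with axes, which is why a geom has to be added before anything is drawn.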
Use geom_line() to create a line plot (not optimal here, though).
Within a geom_*() call, you can manipulate visual aesthetics such as the color, shape, and size of your points.
Each geom comes with its own properties (called arguments) and the same argument may result in a different change depending on the geom you are using.
g + geom_point(color = "firebrick", shape = "diamond", size = 2) +
geom_line(color = "firebrick", linetype = "dotted", size = .3)
And to illustrate some more of ggplot’s versatility, let’s get rid of the grayish default {ggplot2} look by setting a different built-in theme, e.g. theme_bw(). By calling theme_set(), all following plots will have the same black-and-white theme. The red points look way better now!
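A minimal sketch of switching the theme globally could look like the following. The data frame `chic_demo` is a made-up stand-in for `chic` (an assumption for illustration); theme_set(), theme_get(), and theme_bw() are the regular {ggplot2} functions.

```r
library(ggplot2)

# Hypothetical stand-in for `chic`: same column names, made-up values
set.seed(1)
dates <- seq(as.Date("1997-01-01"), by = "month", length.out = 48)
chic_demo <- data.frame(
  date = dates,
  temp = 50 - 25 * cos(2 * pi * (seq_along(dates) - 1) / 12) + rnorm(48, sd = 5)
)

# theme_set() returns the previously active theme, so keep it around
old_theme <- theme_set(theme_bw())

# The currently active theme is now theme_bw()
active <- theme_get()

# Every plot created from now on is drawn with the active theme
p <- ggplot(chic_demo, aes(x = date, y = temp)) +
  geom_point(color = "firebrick")

# Restore the previous theme when you are done
theme_set(old_theme)
```

Keeping the return value of theme_set() is a small design choice that makes it easy to undo the change at the end of a script.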
theme() is an essential command to manually modify all kinds of theme elements (texts, rectangles, and lines).
The labs() function takes a character string for each label we want to change (here x and y):
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)")
**💁 You can also add each axis title via xlab() and ylab().** Example:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
xlab("Year") +
ylab("Temperature (°F)")
The code below shows how to add not only symbols but also, e.g., superscripts (using ^):
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = expression(paste("Temperature (", degree ~ F, ")"^"(Hey, why should we use metric units?!)")))
We can change the properties of all or particular text elements (here axis titles) by overwriting the default element_text() within the theme() call:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(axis.title.x = element_text(vjust = 0, size = 15),
axis.title.y = element_text(vjust = 2, size = 15))
The vjust argument refers to the vertical alignment, which usually ranges between 0 and 1, but you can also specify values outside that range. Note that the y-axis title is rotated, so changing its vjust moves it horizontally on the page (which is correct from the label’s perspective).
You can also change the distance by specifying the margin of both text elements:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(axis.title.x = element_text(margin = margin(t = 10), size = 15),
axis.title.y = element_text(margin = margin(r = 10), size = 15))
The labels t and r within the margin() object refer to top and right, respectively.
💡 A good way to remember the order of the margin sides is “t-r-oub-l-e”.
Within the element_text() we can for example overwrite the defaults for size, color, and face:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (ºF)") +
theme(axis.title = element_text(size = 15, color = "firebrick", face = "italic"))
The face argument can be used to make the font bold or italic or even bold.italic.
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(axis.title.x = element_text(color = "sienna", size = 15),
axis.title.y = element_text(color = "orangered", size = 15))
💁 You could also use a combination of axis.title and axis.title.y, since axis.title.x inherits the values from axis.title. Example:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (ºF)") +
theme(axis.title = element_text(color = "sienna", size = 15),
axis.title.y = element_text(color = "orangered", size = 15))
You can modify some properties for both axis titles at once and others for only one of them, or set the properties of each title on its own:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (ºF)") +
theme(axis.title = element_text(color = "sienna", size = 15, face = "bold"),
axis.title.y = element_text(face = "bold.italic"))
You can also change the appearance of the axis text (here the numbers) by using axis.text and/or the subordinated elements axis.text.x and axis.text.y:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(axis.text = element_text(color = "dodgerblue", size = 12),
axis.text.x = element_text(face = "italic"))
## Rotate Axis Text
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(axis.text.x = element_text(angle = 50, vjust = 1, hjust = 1, size = 12))
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(axis.ticks.y = element_blank(),
axis.text.y = element_blank())
element_blank() removes the element (and thus is not considered an official element)
💡 Use it if you want to get rid of a theme element
We could again use element_blank(), but it is way simpler to just remove the label in the labs() (or xlab()) call:
💡 Note that NULL removes the element (similarly to element_blank()) while empty quotes "" will keep the spacing for the axis title and simply print nothing.
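The difference between NULL and empty quotes can be sketched as follows. The data frame `chic_demo` is a made-up stand-in for `chic` (an assumption for illustration), and the object names are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for `chic`: same column names, made-up values
set.seed(1)
dates <- seq(as.Date("1997-01-01"), by = "month", length.out = 48)
chic_demo <- data.frame(
  date = dates,
  temp = 50 - 25 * cos(2 * pi * (seq_along(dates) - 1) / 12) + rnorm(48, sd = 5)
)

p <- ggplot(chic_demo, aes(x = date, y = temp)) +
  geom_point(color = "firebrick")

# NULL removes the x-axis title, so its space is given back to the panel
p_null <- p + labs(x = NULL, y = "Temperature (°F)")

# Empty quotes keep the space for the x-axis title and print nothing
p_empty <- p + labs(x = "", y = "Temperature (°F)")
```

Printing `p_null` and `p_empty` side by side should show a slightly taller panel in the first plot.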
You can ZOOM IN instead of subsetting your data:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
ylim(c(0, 50))
## Warning: Removed 777 rows containing missing values (geom_point).
You can also use scale_y_continuous(limits = c(0, 50)) or coord_cartesian(ylim = c(0, 50)). The former removes all data points outside the range, just like ylim(c(0, 50)) (subsetting), while the latter only adjusts the visible area (zooming). You may wonder whether both give the same result in the end. Not really; there is an important difference. Compare the two following plots:
You might have spotted that on the left there is some empty buffer around your y limits while on the right points are plotted right up to the border and even beyond. This perfectly illustrates the subsetting (left) versus the zooming (right). To show why this is important let’s have a look at a different chart type, a box plot:
… Because scale_x|y_continuous() subsets the data first, we get completely different (and wrong, at least if this was not your aim) estimates for the box plots! I hope you don’t have to go back to your old scripts now to check whether you manipulated your data while plotting and reported wrong summary stats in your report, paper, or thesis…
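The effect on box plots can be checked numerically. The sketch below uses made-up temperatures for two seasons (`chic_demo` is a hypothetical stand-in, not the real data) and compares the medians that each version of the plot would draw, using layer_data() to extract the computed statistics.

```r
library(ggplot2)

# Made-up data: a cool and a warm season (an assumption for illustration)
set.seed(1)
chic_demo <- data.frame(
  season = rep(c("Winter", "Summer"), each = 100),
  temp   = c(rnorm(100, mean = 30, sd = 12), rnorm(100, mean = 70, sd = 12))
)

p <- ggplot(chic_demo, aes(x = season, y = temp)) +
  geom_boxplot()

# Subsetting: values outside 0-50 are dropped before the statistics are computed
p_subset <- p + scale_y_continuous(limits = c(0, 50))

# Zooming: statistics use all data, only the visible window changes
p_zoom <- p + coord_cartesian(ylim = c(0, 50))

# Medians as drawn by each version of the plot
medians_all    <- layer_data(p)$middle
medians_subset <- suppressWarnings(layer_data(p_subset)$middle)
medians_zoom   <- layer_data(p_zoom)$middle
```

Here `medians_zoom` matches `medians_all`, while `medians_subset` does not, because the limits on the scale removed observations before the box plot statistics were calculated.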
You can also force R to plot the graph starting at the origin:
library(tidyverse)
chic_high <- dplyr::filter(chic, temp > 25, o3 > 20)
ggplot(chic_high, aes(x = temp, y = o3)) +
geom_point(color = "darkcyan") +
labs(x = "Temperature higher than 25°F",
y = "Ozone higher than 20 ppb") +
expand_limits(x = 0, y = 0)
💁 Using coord_cartesian(xlim = c(0, NA), ylim = c(0, NA)) will lead to the same result:
library(tidyverse)
chic_high <- dplyr::filter(chic, temp > 25, o3 > 20)
ggplot(chic_high, aes(x = temp, y = o3)) +
geom_point(color = "darkcyan") +
labs(x = "Temperature higher than 25°F",
y = "Ozone higher than 20 ppb") +
coord_cartesian(xlim = c(0, NA), ylim = c(0, NA))
But again, we can force it to literally start plotting from the origin:
ggplot(chic_high, aes(x = temp, y = o3)) +
geom_point(color = "darkcyan") +
labs(x = "Temperature higher than 25°F",
y = "Ozone higher than 20 ppb") +
expand_limits(x = 0, y = 0) +
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0)) +
coord_cartesian(clip = "off")
💡 The argument clip = "off" in any coordinate system (the functions starting with coord_*) allows drawing outside of the panel area.
Here, let’s plot temperature against temperature with some random noise.
coord_fixed() (also available under the alias coord_equal()) is a coordinate system with a fixed ratio representing the number of units on the y-axis equivalent to one unit on the x-axis. The default, ratio = 1, ensures that one unit on the x-axis is the same length as one unit on the y-axis:
ggplot(chic, aes(x = temp, y = temp + rnorm(nrow(chic), sd = 20))) +
geom_point(color = "sienna") +
labs(x = "Temperature (°F)", y = "Temperature (°F) + random noise") +
xlim(c(0, 100)) + ylim(c(0, 150)) +
coord_fixed()
## Warning: Removed 52 rows containing missing values (geom_point).
Ratios higher than one make units on the y axis longer than units on the x-axis, and vice versa:
ggplot(chic, aes(x = temp, y = temp + rnorm(nrow(chic), sd = 20))) +
geom_point(color = "sienna") +
labs(x = "Temperature (ºF)", y = "Temperature (ºF) + random noise") +
xlim(c(0,100)) + ylim(c(0, 150)) +
coord_fixed(ratio = 1/5)
## Warning: Removed 52 rows containing missing values (geom_point).
Sometimes it is handy to alter your labels a little, perhaps adding units or percent signs without adding them to your data. You can use a function in this case:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = NULL)+
scale_y_continuous(labels = function(x) paste(x, "Degrees Fahrenheit"))
We can add a title with the ggtitle() function:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (ºF)") +
ggtitle("Temperatures in Chicago")
Alternatively, you can use labs(). Here you can add several arguments, e.g. additionally a subtitle, a caption, and a tag (as well as axis titles as shown before):
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)",
title = "Temperatures in Chicago",
subtitle = "Seasonal pattern of daily temperatures from 1997 to 2001",
caption = "Data: NMMAPS",
tag = "Fig. 1")
Again, since we want to modify the properties of a theme element, we use the theme() function and, as for the text elements axis.title and axis.text, modify the font face and the margin. All the following modifications of theme elements work not only for the title but also for all other labels such as plot.subtitle, plot.caption, legend.title, legend.text, axis.title, and axis.text.
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)",
title = "Temperatures in Chicago") +
theme(plot.title = element_text(face = "bold",
margin = margin(10, 0, 10, 0),
size = 14))
💡 A nice way to remember the order of the margin arguments is “t-r-oub-l-e”, which contains the first letters of the four sides in order.
The general alignment (left, center, right) is controlled by hjust (which stands for horizontal adjustment):
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = NULL,
title = "Temperatures in Chicago",
caption = "Data: NMMAPS") +
theme(plot.title = element_text(hjust = 1, size = 16, face = "bold.italic"))
You can also adjust the vertical alignment with vjust.
You can specify the alignment of the title, subtitle, and caption either based on the panel area (the default) or the plot margin via plot.title.position and plot.caption.position. The latter is actually the better choice design-wise in most cases, and many people were very happy about that new feature since the alignment looks awful especially with very long y-axis labels:
(g <- ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
scale_y_continuous(labels = function(x) paste(x, "Degrees Fahrenheit")) +
labs(x = "Year", y = NULL,
title = "Temperatures in Chicago between 1997 and 2001 in Degrees Fahrenheit",
caption = "Data: NMMAPS") +
theme(plot.title = element_text(size = 14, face = "bold.italic"),
plot.caption = element_text(hjust = 0)))
## Use a Non-Traditional Font in Your Title
You can also use fonts other than the default one provided by ggplot (which differs between operating systems). There are several packages that help you to use fonts which are installed on your machine (and which you may be using in your office program). Here, I use the showtext package, which makes it easy to use various types of fonts (TrueType, OpenType, Type 1, web fonts, etc.) in R plots. After loading the package, you need to import the font, which has to be installed on your device as well. I regularly use Google fonts, which can be imported with the function font_add_google(), but you can also add other fonts with font_add(). (Note that even when using Google fonts you must install the font, and restart RStudio, to use it.)
library(showtext)
## Loading required package: sysfonts
## Loading required package: showtextdb
font_add_google("Playfair Display", ## name of Google font
"Playfair") ## name that will be used in R
font_add_google("Bangers", "Bangers")
Now, we can use those font families using …… theme():
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)",
title = "Temperatures in Chicago",
subtitle = "Daily temperatures in °F from 1997 to 2001") +
theme(plot.title = element_text(family = "Bangers", hjust = .5, size = 25),
plot.subtitle = element_text(family = "Playfair", hjust = .5, size = 15))
You can use the lineheight argument to change the spacing between lines. In this example, I have squished the lines together (lineheight < 1).
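A minimal sketch of a two-line title with reduced line spacing might look like this. The data frame `chic_demo` is a made-up stand-in for `chic`, and the value .8 is just an illustrative choice, not one taken from the text above.

```r
library(ggplot2)

# Hypothetical stand-in for `chic`: same column names, made-up values
set.seed(1)
dates <- seq(as.Date("1997-01-01"), by = "month", length.out = 48)
chic_demo <- data.frame(
  date = dates,
  temp = 50 - 25 * cos(2 * pi * (seq_along(dates) - 1) / 12) + rnorm(48, sd = 5)
)

# "\n" breaks the title into two lines; lineheight < 1 moves them closer together
p <- ggplot(chic_demo, aes(x = date, y = temp)) +
  geom_point(color = "firebrick") +
  labs(x = "Year", y = "Temperature (°F)") +
  ggtitle("Temperatures in Chicago\nfrom 1997 to 2001") +
  theme(plot.title = element_text(lineheight = .8, size = 16))
```

The lineheight setting only has a visible effect when the text actually spans several lines.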
We will color code the plot based on season. Or to phrase it in a more ggplot’ish way: we map the variable season to the aesthetic color.
One nice thing about {ggplot2} is that it adds a legend by default when mapping a variable to an aesthetic. You can see that by default the legend title is what we specified in the color argument:
ggplot(chic,
aes(x = date, y = temp, color = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)")
One of the first questions is always: “How can I get rid of the legend?”
It is quite easy and always works with theme(legend.position = "none"):
ggplot(chic,
aes(x = date, y = temp, color = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)") +
theme(legend.position = "none")
You can also use guides(color = "none") or scale_color_discrete(guide = "none") depending on the specific case. While the change of the theme element removes all legends at once, you can remove particular legends with the latter options while keeping some others:
ggplot(chic,
aes(x = date, y = temp,
color = season, shape = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)") +
guides(color = "none")
Here, for example, we keep the legend for the shapes while discarding the one for the colors.
As we already learned, use element_blank() to draw nothing:
ggplot(chic, aes(x = date, y = temp, color = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)") +
theme(legend.title = element_blank())
💁 You can achieve the same by setting the legend name to NULL, either via scale_color_discrete(name = NULL) or labs(color = NULL).
If you want to place the legend not on the right, you can use legend.position as argument in theme. Possible positions are “top”, “right” (which is the default), “bottom”, and “left”.
ggplot(chic, aes(x = date, y = temp, color = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)") +
theme(legend.position = "top")
You can also place the legend inside the panel by specifying a vector with relative x and y coordinates ranging from 0 (left or bottom) to 1 (right or top):
ggplot(chic, aes(x = date, y = temp, color = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)",
color = NULL) +
theme(legend.position = c(.15, .15),
legend.background = element_rect(fill = "transparent"))
Above I also overwrite the default white legend background with a transparent fill to make sure the legend does not hide any data points.
As you have seen, the legend direction is by default vertical but horizontal when you choose either the “top” or “bottom” position. But you can also switch the direction as you like:
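One way to switch the direction is the theme element legend.direction, as in the sketch below. The data frame `chic_demo` is a made-up stand-in for `chic` (an assumption for illustration), and the object names are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for `chic`: same column names, made-up values
set.seed(1)
dates <- seq(as.Date("1997-01-01"), by = "month", length.out = 48)
chic_demo <- data.frame(
  date   = dates,
  temp   = 50 - 25 * cos(2 * pi * (seq_along(dates) - 1) / 12) + rnorm(48, sd = 5),
  season = rep(c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
                 "Summer", "Summer", "Autumn", "Autumn", "Autumn", "Winter"), 4)
)

p <- ggplot(chic_demo, aes(x = date, y = temp, color = season)) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)")

# Legend stays on the right (the default) but its keys are laid out horizontally
p_horizontal <- p + theme(legend.direction = "horizontal")

# Legend moves to the top but keeps a vertical layout
p_vertical <- p + theme(legend.position = "top",
                        legend.direction = "vertical")
```

The same can be done per legend with guide_legend(direction = "horizontal") inside guides() if only one of several legends should change.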
You can change the appearance of the legend title by adjusting the theme element legend.title:
The easiest way to change the title of the legend is the labs() layer:
ggplot(chic, aes(x = date, y = temp, color = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)",
color = "Seasons\nindicated\nby colors:") +
theme(legend.title = element_text(family = "Playfair",
color = "chocolate",
size = 14, face = "bold"))
The legend details can be changed via scale_color_discrete(name = "title") or guides(color = guide_legend("title")):
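Both variants can be sketched as follows. The data frame `chic_demo` is a made-up stand-in for `chic`, and the title string "Seasons:" is only an illustrative choice.

```r
library(ggplot2)

# Hypothetical stand-in for `chic`: same column names, made-up values
set.seed(1)
dates <- seq(as.Date("1997-01-01"), by = "month", length.out = 48)
chic_demo <- data.frame(
  date   = dates,
  temp   = 50 - 25 * cos(2 * pi * (seq_along(dates) - 1) / 12) + rnorm(48, sd = 5),
  season = rep(c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
                 "Summer", "Summer", "Autumn", "Autumn", "Autumn", "Winter"), 4)
)

p <- ggplot(chic_demo, aes(x = date, y = temp, color = season)) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)")

# Variant 1: set the legend title through the color scale
p_scale <- p + scale_color_discrete(name = "Seasons:")

# Variant 2: set the legend title through the guide (first argument is the title)
p_guides <- p + guides(color = guide_legend("Seasons:"))
```

The scale variant is handy when you change other scale settings (such as labels) in the same call anyway.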
To change the order of the legend keys, we can change the levels of season:
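A minimal sketch of reordering the levels could look like this. The data frame `chic_demo` is a made-up stand-in for `chic`, and the season names and their order are assumptions chosen for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for `chic`: same column names, made-up values
set.seed(1)
dates <- seq(as.Date("1997-01-01"), by = "month", length.out = 48)
chic_demo <- data.frame(
  date   = dates,
  temp   = 50 - 25 * cos(2 * pi * (seq_along(dates) - 1) / 12) + rnorm(48, sd = 5),
  season = rep(c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
                 "Summer", "Summer", "Autumn", "Autumn", "Autumn", "Winter"), 4)
)

# A character column is shown in alphabetical order;
# a factor is shown in the order of its levels
chic_demo$season <- factor(chic_demo$season,
                           levels = c("Winter", "Spring", "Summer", "Autumn"))

p <- ggplot(chic_demo, aes(x = date, y = temp, color = season)) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)")
```

Because the order lives in the data, every later plot of `chic_demo` picks it up without further changes.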
We are going to replace the seasons by the months which they are covering by providing a vector of names in the scale_color_discrete() call:
ggplot(chic, aes(x = date, y = temp, color = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)") +
scale_color_discrete("Seasons:", labels = c("Mar—May", "Jun—Aug",
"Sep—Nov", "Dec—Feb")) +
theme(legend.title = element_text(family = "Playfair",
color = "chocolate",
size = 14, face = 2))
To change the background color (fill) of the legend keys, we adjust the setting for the theme element legend.key:
ggplot(chic, aes(x = date, y = temp, color = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)") +
theme(legend.key = element_rect(fill = "darkgoldenrod1"),
legend.title = element_text(family = "Playfair",
color = "chocolate",
size = 14, face = 2)) +
scale_color_discrete("Seasons:")
If you want to get rid of them entirely, use fill = NA or fill = "transparent".
Points in the legend can get a little lost with the default size, especially without the boxes. To override the default one uses again the guides layer like this:
ggplot(chic, aes(x = date, y = temp, color = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)") +
theme(legend.key = element_rect(fill = NA),
legend.title = element_text(color = "chocolate",
size = 14, face = 2)) +
scale_color_discrete("Seasons:") +
guides(color = guide_legend(override.aes = list(size = 6)))
Let’s say you have two different geoms mapped to the same variable. For example, color as an aesthetic for both a point layer and a rug layer of the same data. By default, both the points and the “line” end up in the legend like this:
ggplot(chic, aes(x = date, y = temp, color = season)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)") +
geom_rug()
You can use show.legend = FALSE to turn off a layer in the legend:
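As a sketch, the rug layer can be excluded from the legend like this. The data frame `chic_demo` is a made-up stand-in for `chic` (an assumption for illustration).

```r
library(ggplot2)

# Hypothetical stand-in for `chic`: same column names, made-up values
set.seed(1)
dates <- seq(as.Date("1997-01-01"), by = "month", length.out = 48)
chic_demo <- data.frame(
  date   = dates,
  temp   = 50 - 25 * cos(2 * pi * (seq_along(dates) - 1) / 12) + rnorm(48, sd = 5),
  season = rep(c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
                 "Summer", "Summer", "Autumn", "Autumn", "Autumn", "Winter"), 4)
)

# The rug is still drawn, but only the points contribute keys to the legend
p <- ggplot(chic_demo, aes(x = date, y = temp, color = season)) +
  geom_point() +
  labs(x = "Year", y = "Temperature (°F)") +
  geom_rug(show.legend = FALSE)
```

show.legend is an argument of the individual layer, so it can be set differently for each geom in the same plot.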
{ggplot2} will not add a legend automatically unless you map aesthetics (color, size etc.) to a variable. There are times, though, that I want to have a legend so that it is clear what you are plotting.
Here is the default:
ggplot(chic, aes(x = date, y = o3)) +
geom_line(color = "gray") +
geom_point(color = "darkorange2") +
labs(x = "Year", y = "Ozone")
We can force a legend by mapping a guide to a variable. We are mapping the lines and the points using aes() and we are mapping not to a variable in our dataset but to a single string (so that we get just one color for each).
ggplot(chic, aes(x = date, y = o3)) +
geom_line(aes(color = "line")) +
geom_point(aes(color = "points")) +
labs(x = "Year", y = "Ozone") +
scale_color_discrete("Type:")
Not yet what we want!
We want gray and orange! To change the colors, we use scale_color_manual(). Additionally, we override the legend aesthetics using the guides() function.
Voilà! Now we have a plot with gray lines and orange points as well as a single gray line and a single orange point as legend symbols:
ggplot(chic, aes(x = date, y = o3)) +
geom_line(aes(color = "line")) +
geom_point(aes(color = "points")) +
labs(x = "Year", y = "Ozone") +
scale_color_manual(name = NULL,
guide = "legend",
values = c("points" = "darkorange2",
"line" = "gray")) +
guides(color = guide_legend(override.aes = list(linetype = c(1, 0),
shape = c(NA, 16))))
The default legend for categorical variables such as season is a guide_legend() as you have seen in several previous examples. If you map a continuous variable to an aesthetic, {ggplot2} will by default not use guide_legend() but guide_colorbar() (or guide_colourbar()):
ggplot(chic,
aes(x = date, y = temp, color = temp)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)", color = "Temperature (°F)")
However, by using guide_legend() you can force the legend to show discrete colors for a given number of breaks as in case of a categorical variable:
ggplot(chic,
aes(x = date, y = temp, color = temp)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)", color = "Temperature (°F)") +
guides(color = guide_legend())
You can also use binned scales:
ggplot(chic,
aes(x = date, y = temp, color = temp)) +
geom_point() +
labs(x = "Year", y = "Temperature (°F)", color = "Temperature (°F)") +
guides(color = guide_bins())
There are ways to change the entire look of your plot with one function (see “Working with Themes” section below) but if you want to simply change the colors of some elements, you can also do that.
There are two types of grid lines: major grid lines indicating the ticks and minor grid lines between the major ones. You can change all of these by overwriting the defaults for panel.grid or for each set of gridlines separately panel.grid.major and panel.grid.minor.
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(panel.background = element_rect(fill = "gray90"),
panel.grid.major = element_line(color = "gray10", size = .5),
panel.grid.minor = element_line(color = "gray70", size = .25))
You can even specify settings for all four different levels:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(panel.background = element_rect(fill = "gray90"),
panel.grid.major = element_line(size = .5, linetype = "dashed"),
panel.grid.minor = element_line(size = .25, linetype = "dotted"),
panel.grid.major.x = element_line(color = "red1"),
panel.grid.major.y = element_line(color = "blue1"),
panel.grid.minor.x = element_line(color = "red4"),
panel.grid.minor.y = element_line(color = "blue4"))
And, of course, you can remove some or all grid lines if you like:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(panel.grid.minor = element_blank())
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(panel.grid = element_blank())
Furthermore, you can also define the spacing of both the major and the minor grid lines via breaks:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
scale_y_continuous(breaks = seq(0, 100, 10),
minor_breaks = seq(0, 100, 2.5))
To change the background color (fill) of the panel area (i.e. the area where the data is plotted), one needs to adjust the theme element panel.background:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "#1D8565", size = 2) +
labs(x = "Year", y = "Temperature (°F)") +
theme(panel.background = element_rect(fill = "#64D2AA",
color = "#64D2AA", size = 2))
Note that the color, i.e. the outline of the panel background, did not change even though we specified it. This is because there is a layer on top of panel.background, namely panel.border. You can style that element instead; however, make sure to use a transparent fill there, otherwise your data is hidden behind this layer. In the following example, I illustrate that by using a semitransparent hex color for the fill argument in element_rect():
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "#1D8565", size = 2) +
labs(x = "Year", y = "Temperature (°F)") +
theme(panel.border = element_rect(fill = "#64D2AA99",
color = "#64D2AA", size = 2))
Similarly, to change the background color (fill) of the plot area, one needs to modify the theme element plot.background:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(plot.background = element_rect(fill = "gray60",
color = "gray30", size = 2))
You can achieve a uniform background color by either setting the same colors in both panel.background and plot.background or by setting the background filling of the panel to "transparent" or NA:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(panel.background = element_rect(fill = NA),
plot.background = element_rect(fill = "gray60",
color = "gray30", size = 2))
# Working with Margins
Sometimes it is useful to add a little space to the plot margin. Similar to the previous examples, we can use an argument to the theme() function. In this case the argument is plot.margin. In the previous example we already illustrated the default margin by changing the background color using plot.background.
plot.margin can handle a variety of different units (cm, inches, etc.), but it requires the function unit() from the package grid to specify the units. Here I am using a 1 cm margin on the top and bottom, a 3 cm margin on the right, and an 8 cm margin on the left.
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "firebrick") +
labs(x = "Year", y = "Temperature (°F)") +
theme(plot.background = element_rect(fill = "gray60"),
plot.margin = unit(c(1, 3, 1, 8), "cm"))
The order of the margin sides is top, right, bottom, left. A nice way to remember this order is “trouble”, which contains the first letters of the four sides in order.
The {ggplot2} package has two nice functions for creating multi-panel plots, called facets. They are related but a little different: facet_wrap creates essentially a ribbon of plots based on a single variable while facet_grid spans a grid of two variables.
facet_wrap creates a facet of a single variable, written with a tilde in front: facet_wrap(~ variable). The appearance of these subplots is controlled by the arguments ncol and nrow:
g <- ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "chartreuse4", alpha = .3) +
labs(x = "Year", y = "Temperature (°F)") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
g + facet_wrap(~ year, nrow = 1)
Accordingly, you can arrange the plots as you like as a matrix…
… or even as an asymmetric grid of plots:
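Both arrangements can be sketched with the ncol argument. The data frame `chic_demo` is a made-up stand-in for `chic` with four years of data (an assumption for illustration), and the object names are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for `chic`: same column names, made-up values
set.seed(1)
dates <- seq(as.Date("1997-01-01"), by = "month", length.out = 48)
chic_demo <- data.frame(
  date = dates,
  temp = 50 - 25 * cos(2 * pi * (seq_along(dates) - 1) / 12) + rnorm(48, sd = 5),
  year = as.integer(format(dates, "%Y"))
)

g_demo <- ggplot(chic_demo, aes(x = date, y = temp)) +
  geom_point(color = "chartreuse4", alpha = .3) +
  labs(x = "Year", y = "Temperature (°F)")

# Four years in two columns: a 2 x 2 matrix of panels
p_matrix <- g_demo + facet_wrap(~ year, ncol = 2)

# Four years in three columns: three panels in the first row, one in the second
p_asym <- g_demo + facet_wrap(~ year, ncol = 3)
```

With four panels, any column count that does not divide four evenly leaves the last row partly empty, which gives the asymmetric layout.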
The default for multi-panel plots in {ggplot2} is to use equivalent scales in each panel. But sometimes you want to allow a panel’s own data to determine the scale. This is often not a good idea since it may give your readers the wrong impression about the data. But sometimes it is indeed useful, and to do this you can set scales = "free":
Note that both the x and y axes differ in their range!
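A sketch of free scales per panel might look as follows. The data frame `chic_demo` is a made-up stand-in for `chic` (an assumption for illustration).

```r
library(ggplot2)

# Hypothetical stand-in for `chic`: same column names, made-up values
set.seed(1)
dates <- seq(as.Date("1997-01-01"), by = "month", length.out = 48)
chic_demo <- data.frame(
  date = dates,
  temp = 50 - 25 * cos(2 * pi * (seq_along(dates) - 1) / 12) + rnorm(48, sd = 5),
  year = as.integer(format(dates, "%Y"))
)

g_demo <- ggplot(chic_demo, aes(x = date, y = temp)) +
  geom_point(color = "chartreuse4", alpha = .3) +
  labs(x = "Year", y = "Temperature (°F)")

# Each panel gets x and y ranges based on its own data
p_free <- g_demo + facet_wrap(~ year, nrow = 2, scales = "free")

# Only the x-axis varies between panels; the y-axis stays shared
p_free_x <- g_demo + facet_wrap(~ year, nrow = 2, scales = "free_x")
```

Using "free_x" or "free_y" instead of "free" is a middle ground that keeps one axis comparable across panels.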
In case of two variables, facet_grid does the job. Here, the order of the variables determines the number of rows and columns:
ggplot(chic, aes(x = date, y = temp)) +
geom_point(color = "orangered", alpha = .3) +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
labs(x = "Year", y = "Temperature (°F)") +
facet_grid(year ~ season)
To change from row to column arrangement you can change facet_grid(year ~ season) to facet_grid(season ~ year).
The function facet_wrap can also take two variables and you are still able to control the grid design:
By using theme(), you can modify the appearance of the strip text (i.e. the title for each facet) and the strip text boxes:
g + facet_wrap(~ year, nrow = 1, scales = "free_x") +
theme(strip.text = element_text(face = "bold", color = "chartreuse4",
hjust = 0, size = 20),
strip.background = element_rect(fill = "chartreuse3", linetype = "dotted"))
The following two functions allow highlighting specific labels in combination with element_textbox(), which is provided by {ggtext}.
library(ggtext)
library(rlang)
##
## Attaching package: 'rlang'
## The following objects are masked from 'package:purrr':
##
## %@%, as_function, flatten, flatten_chr, flatten_dbl, flatten_int,
## flatten_lgl, flatten_raw, invoke, list_along, modify, prepend,
## splice
element_textbox_highlight <- function(..., hi.labels = NULL, hi.fill = NULL,
hi.col = NULL, hi.box.col = NULL, hi.family = NULL) {
structure(
c(element_textbox(...),
list(hi.labels = hi.labels, hi.fill = hi.fill, hi.col = hi.col, hi.box.col = hi.box.col, hi.family = hi.family)
),
class = c("element_textbox_highlight", "element_textbox", "element_text", "element")
)
}
element_grob.element_textbox_highlight <- function(element, label = "", ...) {
if (label %in% element$hi.labels) {
element$fill <- element$hi.fill %||% element$fill
element$colour <- element$hi.col %||% element$colour
element$box.colour <- element$hi.box.col %||% element$box.colour
element$family <- element$hi.family %||% element$family
}
NextMethod()
}
Now you can use it and specify, for example, all strip texts showing a year:
g + facet_wrap(year ~ season, nrow = 4, scales = "free_x") +
theme(
strip.background = element_blank(),
strip.text = element_textbox_highlight(
family = "Playfair", size = 12, face = "bold",
fill = "white", box.color = "chartreuse4", color = "chartreuse4",
halign = .5, linetype = 1, r = unit(5, "pt"), width = unit(1, "npc"),
padding = margin(5, 0, 3, 0), margin = margin(0, 1, 3, 1),
hi.labels = c("1997", "1998", "1999", "2000"),
hi.fill = "chartreuse4", hi.box.col = "black", hi.col = "white"
)
)
ggplot(chic, aes(x = date, y = temp)) +
geom_point(aes(color = season == "Summer"), alpha = .3) +
labs(x = "Year", y = "Temperature (°F)") +
facet_wrap(~ season, nrow = 1) +
scale_color_manual(values = c("gray40", "firebrick"), guide = "none") +
theme(
axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
strip.background = element_blank(),
strip.text = element_textbox_highlight(
size = 12, face = "bold",
fill = "white", box.color = "white", color = "gray40",
halign = .5, linetype = 1, r = unit(0, "pt"), width = unit(1, "npc"),
padding = margin(2, 0, 1, 0), margin = margin(0, 1, 3, 1),
hi.labels = "Summer", hi.family = "Bangers",
hi.fill = "firebrick", hi.box.col = "firebrick", hi.col = "white"
)
)There are several ways how plots can be combined. The easiest approach in my opinion is the {patchwork} package by Thomas Lin Pedersen:
p1 <- ggplot(chic, aes(x = date, y = temp,
                       color = season)) +
  geom_point() +
  geom_rug() +
  labs(x = "Year", y = "Temperature (°F)")

p2 <- ggplot(chic, aes(x = date, y = o3)) +
  geom_line(color = "gray") +
  geom_point(color = "darkorange2") +
  labs(x = "Year", y = "Ozone")

library(patchwork)
p1 + p2

We can change the arrangement by “dividing” both plots (and note the alignment even though one has a legend and one doesn’t!):
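As a minimal sketch of this “division”: in {patchwork}, the `/` operator stacks the first plot on top of the second. The plots `with_legend` and `no_legend` are hypothetical stand-ins for `p1` and `p2`, built from the `mpg` data that ships with {ggplot2} so that the snippet runs on its own:

```r
library(ggplot2)
library(patchwork)

# Hypothetical stand-ins for p1 and p2 (not the tutorial's Chicago data)
with_legend <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()
no_legend <- ggplot(mpg, aes(x = displ, y = cty)) +
  geom_point(color = "darkorange2")

# `/` places the first plot above the second one
stacked <- with_legend / no_legend
stacked
```

With the plots from above, the equivalent call would simply be `p1 / p2`.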
And also nested plots are possible!
(Note the alignment of the plots even though only one row contains a legend.)
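A nested composition might look like the following sketch. Parentheses group plots into a row before the row is stacked on another plot; the plot objects are again hypothetical stand-ins based on the built-in `mpg` data, and the particular arrangement is only an illustration:

```r
library(ggplot2)
library(patchwork)

# Hypothetical stand-in plots, one with a legend and one without
with_legend <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()
no_legend <- ggplot(mpg, aes(x = displ, y = cty)) +
  geom_point(color = "darkorange2")

# Two plots side by side in the top row, one full-width plot below
nested <- (no_legend + with_legend) / no_legend
nested
```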
Alternatively, the {cowplot} package provides the functionality to combine multiple plots (and lots of other good utilities):
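A minimal sketch with {cowplot}, assuming its `plot_grid()` function and using hypothetical stand-in plots `a` and `b` built from the `mpg` data so the example is self-contained. Nesting one `plot_grid()` call inside another gives a two-row composition:

```r
library(ggplot2)
library(cowplot)

# Hypothetical stand-in plots (not the tutorial's Chicago data)
a <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()
b <- ggplot(mpg, aes(x = displ, y = cty)) +
  geom_point(color = "darkorange2")

# Top row: two plots side by side; bottom row: one full-width plot
top_row  <- plot_grid(a, b, ncol = 2)
combined <- plot_grid(top_row, b, ncol = 1)
combined
```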
##
## Attaching package: 'cowplot'
## The following object is masked from 'package:patchwork':
##
## align_plots
## Warning in grid.Call(C_stringMetric, as.graphicsAnnot(x$label)): font family
## 'Roboto Condensed' not found in PostScript font database
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family 'Roboto Condensed' not found in PostScript font database
… and so does the {gridExtra} package:
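A sketch of the {gridExtra} approach, assuming `grid.arrange()` with a `layout_matrix`. The plots `a` and `b` are hypothetical stand-ins built from the `mpg` data. Each number in the matrix is the index of a plot, so repeating a number lets that plot span several cells:

```r
library(ggplot2)
library(gridExtra)

# Hypothetical stand-in plots (not the tutorial's Chicago data)
a <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()
b <- ggplot(mpg, aes(x = displ, y = cty)) +
  geom_point(color = "darkorange2")

# Plots 1 and 2 share the top row, plot 3 spans the whole bottom row
layout <- rbind(c(1, 2),
                c(3, 3))

# grid.arrange() draws the composition and invisibly returns it as a gtable
combined <- grid.arrange(a, b, b, layout_matrix = layout)
```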
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
##
## combine
The same idea of defining a layout can be used with {patchwork} as well, which allows you to create complex compositions:
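As a sketch, {patchwork} accepts a textual layout via `plot_layout(design = ...)`: each letter names the area for one plot (in the order the plots are added) and `#` leaves a cell empty. The plots and the specific design below are hypothetical and only meant to illustrate the idea:

```r
library(ggplot2)
library(patchwork)

# Hypothetical stand-in plots (not the tutorial's Chicago data)
a <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()
b <- ggplot(mpg, aes(x = displ, y = cty)) +
  geom_point(color = "darkorange2")

# A: large 2 x 2 block, B: small cell top right, C: tall cell below B
design <- "
AAB
AAC
##C
"

composed <- a + b + b + plot_layout(design = design)
composed
```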
For simple applications, working with colors is straightforward in {ggplot2}. For a more advanced treatment of the topic, you should probably get your hands on Hadley’s book, which has nice coverage. Other good sources are the R Cookbook and the color section in the R Graph Gallery by Yan Holtz.
There are two main differences when it comes to colors in {ggplot2}. Both arguments, color and fill, can be set to a single color or mapped to a variable.
As you have already seen in the beginning of this tutorial, arguments placed inside the aesthetics (aes()) are mapped to variables in the data, whereas those placed outside are fixed properties that are unrelated to the variables. This complete nonsense plot showing the number of records per year and season illustrates that fact:
ggplot(chic, aes(year)) +
  geom_bar(aes(fill = season), color = "grey", size = 2) +
  labs(x = "Year", y = "Observations", fill = "Season:")

Static, single colors are simple to use. We can specify a single color for a geom:

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(color = "steelblue", size = 2) +
  labs(x = "Year", y = "Temperature (°F)")

… and, in case the geom provides both, a color (outline color) and a fill (filling color):

ggplot(chic, aes(x = date, y = temp)) +
  geom_point(shape = 21, size = 2, stroke = 1,
             color = "#3cc08f", fill = "#c08f3c") +
  labs(x = "Year", y = "Temperature (°F)")

Tian Zheng at Columbia has created a useful PDF of R colors. Of course, you can also specify hex color codes (simply as strings, as in the example above) as well as RGB or RGBA values (via the rgb() function: rgb(red, green, blue, alpha)).
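To make the rgb() route concrete, here is a small base R sketch. Note that rgb() expects values between 0 and 1 by default, so `maxColorValue = 255` is set here to work with the familiar 0–255 range; the variable names are made up for illustration:

```r
# Same color as the hex code "#3cc08f" used above, built from RGB values
teal <- rgb(60, 192, 143, maxColorValue = 255)

# Adding an alpha value yields a semi-transparent RGBA color
teal_faded <- rgb(60, 192, 143, alpha = 128, maxColorValue = 255)

teal        # "#3CC08F"
teal_faded  # "#3CC08F80"
```

Both results are plain strings, so they can be passed to `color` or `fill` just like any other hex code.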
In {ggplot2}, colors that are assigned to variables are modified via the scale_color_* and the scale_fill_* functions. In order to use color with your data, you first need to know whether you are dealing with a categorical or a continuous variable. The color palette should be chosen depending on the type of the variable, with sequential or diverging color palettes being used for continuous variables and qualitative color palettes for categorical variables:
Image source: “Hands-On Data Visualization” by Jack Dougherty & Ilya Ilyankou
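As a rough sketch of that distinction, the following uses the `mpg` data that ships with {ggplot2} as a stand-in: a qualitative ColorBrewer palette for a categorical variable and a sequential viridis palette for a continuous one. The palette choices are only examples, not recommendations:

```r
library(ggplot2)

# Categorical variable (drive train) -> qualitative palette
categorical <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_brewer(palette = "Set1")

# Continuous variable (city mileage) -> sequential palette
continuous <- ggplot(mpg, aes(x = displ, y = hwy, color = cty)) +
  geom_point() +
  scale_color_viridis_c()

categorical
continuous
```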
Qualitative or categorical variables represent types of data which can be divided into groups (categories). The variable can be further specified as nominal, ordinal, and binary (dichotomous). Examples of qualitative/categorical variables are: